algorithmic probability
SuperARC: A Test for General and Super Intelligence Based on First Principles of Recursion Theory and Algorithmic Probability
Hernández-Espinosa, Alberto, Ozelim, Luan, Abrahão, Felipe S., Zenil, Hector
We introduce an open-ended test grounded in algorithmic probability that can avoid benchmark contamination in the quantitative evaluation of frontier models in the context of their Artificial General Intelligence (AGI) and Superintelligence (ASI) claims. Unlike other tests, this test does not rely on statistical compression methods (such as GZIP or LZW), which are more closely related to Shannon entropy than to Kolmogorov complexity. The test probes fundamental features of intelligence, such as synthesis and model creation in the context of inverse problems (generating new knowledge from observation). We argue that metrics based on model abstraction and optimal Bayesian inference for planning can provide a robust framework for testing intelligence, including natural intelligence (human and animal), narrow AI, AGI, and ASI. Our results show no clear evidence of LLM convergence towards a defined level of intelligence, particularly AGI or ASI. We found that LLM versions tend to be fragile and incremental: new versions may perform worse than older ones, with progress largely driven by the size of the training data. The results were compared with a hybrid neurosymbolic approach that theoretically guarantees model convergence from optimal inference based on the principles of algorithmic probability and Kolmogorov complexity. This method outperforms LLMs in a proof-of-concept on short binary sequences. Our findings confirm suspicions regarding the fundamental limitations of LLMs, exposing them as systems optimised to create the perception of mastery over human language. Progress among different LLM versions from the same developers was found to be inconsistent and limited, particularly in the absence of a solid symbolic counterpart.
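The inverse-problem flavour of the test can be illustrated with a toy model-synthesis loop: enumerate candidate generating models in order of description length and return the shortest one that reproduces an observed binary sequence, then use it to predict the next symbol. This is a minimal sketch under assumed conventions (models are just repeating seed patterns), not the SuperARC test itself:

```python
from itertools import product

def candidate_models(max_len=8):
    """Enumerate tiny generating 'models': a seed pattern repeated forever,
    shortest seeds first, mimicking a low-complexity-first search."""
    for length in range(1, max_len + 1):
        for seed in product("01", repeat=length):
            yield "".join(seed)

def synthesize(sequence):
    """Inverse problem: find the shortest repeating model that reproduces
    the observation, then use it to predict the next symbol."""
    for seed in candidate_models():
        generated = (seed * (len(sequence) // len(seed) + 2))[:len(sequence) + 1]
        if generated[:-1] == sequence:
            return seed, generated[-1]   # (model, predicted next symbol)
    return None, None

model, nxt = synthesize("0110110")   # shortest reproducing seed: "011"
```

Finding the generator rather than memorising the sequence is the point: the model, once synthesized, answers the prediction question for free.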
- Europe > Belgium (0.04)
- South America > Brazil (0.04)
- North America > United States > Massachusetts > Suffolk County > Boston (0.04)
- (8 more...)
- Education (1.00)
- Health & Medicine > Pharmaceuticals & Biotechnology (0.92)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)
Decoding Geometric Properties in Non-Random Data from First Information-Theoretic Principles
Zenil, Hector, Abrahão, Felipe S.
Based on the principles of information theory, measure theory, and theoretical computer science, we introduce a univariate signal deconvolution method with a wide range of applications to coding theory, particularly in zero-knowledge one-way communication channels, such as in deciphering messages from unknown generating sources about which no prior knowledge is available and to which no return message can be sent. Our multidimensional space reconstruction method from an arbitrary received signal is proven to be agnostic vis-a-vis the encoding-decoding scheme, computation model, programming language, formal theory, the computable (or semi-computable) method of approximation to algorithmic complexity, and any arbitrarily chosen (computable) probability measure of the events. The method derives from the principles of an approach to Artificial General Intelligence capable of building a general-purpose model of models independent of any arbitrarily assumed prior probability distribution. We argue that this optimal and universal method of decoding non-random data has applications to signal processing, causal deconvolution, topological and geometric properties encoding, cryptography, and bio- and technosignature detection.
- North America > United States > New York > New York County > New York City (0.14)
- North America > Puerto Rico > Arecibo > Arecibo (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- (6 more...)
- Health & Medicine (0.45)
- Information Technology > Security & Privacy (0.34)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (0.60)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.34)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)
Simplicity bias, algorithmic probability, and the random logistic map
Hamzi, Boumediene, Dingle, Kamaludin
Simplicity bias is an intriguing phenomenon prevalent in various input-output maps, characterized by a preference for simpler, more regular, or symmetric outputs. Notably, these maps typically feature high-probability outputs with simple patterns, whereas complex patterns are exponentially less probable. This bias has been extensively examined and attributed to principles derived from algorithmic information theory and algorithmic probability. In a significant advancement, it has been demonstrated that the renowned logistic map $x_{k+1}=\mu x_k(1-x_k)$ and other one-dimensional maps exhibit simplicity bias when conceptualized as input-output systems. Building upon this foundational work, our research delves into the manifestations of simplicity bias within the random logistic map, specifically focusing on scenarios involving additive noise. This investigation is driven by the overarching goal of formulating a comprehensive theory for the prediction and analysis of time series. Our primary contributions are multifaceted. We discover that simplicity bias is observable in the random logistic map for specific ranges of $\mu$ and noise magnitudes. Additionally, we find that this bias persists even with the introduction of small measurement noise, though it diminishes as noise levels increase. Our studies also revisit the phenomenon of noise-induced chaos, particularly when $\mu=3.83$, revealing its characteristics through complexity-probability plots. Intriguingly, we employ the logistic map to underscore a paradoxical aspect of data analysis: more data adhering to a consistent trend can occasionally lead to reduced confidence in extrapolation predictions, challenging conventional wisdom. We propose that adopting a probability-complexity perspective in analyzing dynamical systems could significantly enrich statistical learning theories related to series prediction.
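A minimal simulation sketch of the setup: sample trajectories of the noisy logistic map, coarse-grain them to binary strings, and compare output frequency against a complexity proxy. The threshold coarse-graining and the use of zlib compressed length as the complexity proxy are assumptions for illustration, not the paper's actual measures:

```python
import random
import zlib
from collections import Counter

def random_logistic_binary(mu, noise, n=20, burn=100, rng=None):
    """One trajectory of x_{k+1} = mu*x_k*(1-x_k) + additive noise,
    coarse-grained to a binary string via the threshold x > 0.5."""
    rng = rng or random.Random()
    x = rng.random()
    bits = []
    for k in range(burn + n):
        x = mu * x * (1 - x) + rng.uniform(-noise, noise)
        x = min(max(x, 0.0), 1.0)          # keep the state in [0, 1]
        if k >= burn:
            bits.append("1" if x > 0.5 else "0")
    return "".join(bits)

def complexity_proxy(s):
    """zlib-compressed length: a crude stand-in for algorithmic complexity."""
    return len(zlib.compress(s.encode()))

# Tally coarse-grained outputs over many noisy trajectories; under
# simplicity bias, high-frequency outputs tend to have low complexity.
rng = random.Random(0)
counts = Counter(random_logistic_binary(3.83, 0.01, rng=rng) for _ in range(2000))
top = counts.most_common(3)
```

Plotting `complexity_proxy` against log-frequency for the keys of `counts` gives a rough, small-scale version of the complexity-probability plots discussed in the abstract.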
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- North America > United States > New York (0.04)
- (4 more...)
Optimal Spatial Deconvolution and Message Reconstruction from a Large Generative Model of Models
Zenil, Hector, Adams, Alyssa, Abrahão, Felipe S.
We introduce a univariate signal deconvolution method based on the principles of an approach to Artificial General Intelligence in order to build a general-purpose model of models independent of any arbitrarily assumed prior probability distribution. We investigate how non-random data may encode information about the physical properties, such as dimensions and length scales of the space in which a signal or message may have been originally encoded, embedded, or generated. Our multidimensional space reconstruction method is based on information theory and algorithmic probability, so that it is proven to be agnostic vis-a-vis the arbitrarily chosen encoding-decoding scheme, computable or semi-computable method of approximation to algorithmic complexity, and computational model. The results presented in this paper are useful for applications in coding theory, particularly in zero-knowledge one-way communication channels, such as in deciphering messages from unknown generating sources about which no prior knowledge is available and to which no return message can be sent. We argue that this method has the potential to be of great value in cryptography, signal processing, causal deconvolution, life and technosignature detection.
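The geometric idea can be sketched concretely: given a signal known to be a flattened 2D array, try every candidate width and keep the one whose column-major reading is simplest, since the correct geometry aligns vertical structure into long constant runs. The run-count complexity proxy below is an assumed, deliberately crude computable approximation (the method is claimed to be agnostic to this choice):

```python
def column_major(signal, width):
    """Reshape a row-major flattened signal into rows of `width`,
    then read it column by column."""
    rows = [signal[i:i + width] for i in range(0, len(signal), width)]
    return "".join(row[c] for c in range(width) for row in rows)

def runs(s):
    """Number of maximal runs: a crude, computable complexity proxy."""
    return 1 + sum(s[i] != s[i - 1] for i in range(1, len(s)))

def infer_width(signal):
    """Pick the candidate geometry whose column-major reading is simplest:
    the right width turns vertical structure into long constant runs."""
    divisors = [w for w in range(2, len(signal)) if len(signal) % w == 0]
    return min(divisors, key=lambda w: runs(column_major(signal, w)))

# A 6x8 'image' with constant columns (vertical stripes), flattened row by
# row; the reconstruction recovers the original width of 8.
signal = "00110100" * 6
```

Swapping `runs` for any other computable complexity estimator leaves the recipe unchanged, which is the sense in which the reconstruction is agnostic to the chosen approximation.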
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.28)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.28)
- North America > United States > New York > New York County > New York City (0.14)
- (12 more...)
- Information Technology > Artificial Intelligence > Natural Language > Generation (0.40)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.34)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)
Multiclass classification utilising an estimated algorithmic probability prior
Dingle, Kamaludin, Batlle, Pau, Owhadi, Houman
Methods of pattern recognition and machine learning are applied extensively in science, technology, and society. Hence, any advances in related theory may translate into large-scale impact. Here we explore how algorithmic information theory, especially algorithmic probability, may aid in a machine learning task. We study a multiclass supervised classification problem, namely learning the RNA molecule sequence-to-shape map, where the different possible shapes are taken to be the classes. The primary motivation for this work is a proof of concept example, where a concrete, well-motivated machine learning task can be aided by approximations to algorithmic probability. Our approach is based on directly estimating the class (i.e., shape) probabilities from shape complexities, and using the estimated probabilities as a prior in a Gaussian process learning problem. Naturally, with a large amount of training data, the prior has no significant influence on classification accuracy, but in the very small training data regime, we show that using the prior can substantially improve classification accuracy. To our knowledge, this work is one of the first to demonstrate how algorithmic probability can aid in a concrete, real-world, machine learning problem.
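The prior-construction recipe can be sketched as follows: approximate each class's complexity, convert complexities to probabilities via $P(c) \propto 2^{-K(c)}$, and normalise. The zlib stand-in for complexity and the toy shape strings are assumptions for illustration; the paper estimates RNA shape complexities and feeds the prior into a Gaussian process, not into this naive construction:

```python
import zlib

def complexity(s):
    """Compressed length in bits: a crude stand-in for the estimated
    shape complexities used in the paper."""
    return 8 * len(zlib.compress(s.encode()))

def complexity_prior(class_reps):
    """Algorithmic-probability-style prior: P(class) proportional to
    2^(-complexity of a class representative), normalised to sum to 1."""
    weights = {c: 2.0 ** -complexity(rep) for c, rep in class_reps.items()}
    total = sum(weights.values())
    return {c: w / total for c, w in weights.items()}

# Hypothetical classes, each summarised by a dot-bracket 'shape' string:
reps = {
    "open": "." * 16,
    "hairpin": "((((....))))....",
    "two_loop": "((..))..((..))..",
}
prior = complexity_prior(reps)
```

With abundant training data such a prior washes out, but in the small-data regime it injects the inductive bias that simpler shapes are a priori more probable.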
- Europe > Austria > Vienna (0.04)
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- North America > United States > New York (0.04)
- (4 more...)
- Health & Medicine > Pharmaceuticals & Biotechnology (0.67)
- Education > Focused Education > Special Education (0.44)
Algorithmic Probability of Large Datasets and the Simplicity Bubble Problem in Machine Learning
Abrahão, Felipe S., Zenil, Hector, Porto, Fabio, Wehmuth, Klaus
When mining large datasets in order to predict new data, limitations of the principles behind statistical machine learning pose a serious challenge not only to the Big Data deluge, but also to the traditional assumptions that data generating processes are biased toward low algorithmic complexity. Even when one assumes an underlying algorithmic-informational bias toward simplicity in finite dataset generators, we show that fully automated computable learning algorithms (with or without access to pseudo-random generators), in particular those of a statistical nature used in current approaches to machine learning (including deep learning), can always be deceived, naturally or artificially, by sufficiently large datasets. In particular, we demonstrate that, for every finite learning algorithm, there is a sufficiently large dataset size above which the algorithmic probability of an unpredictable deceiver is an upper bound (up to a multiplicative constant that only depends on the learning algorithm) for the algorithmic probability of any other larger dataset. In other words, very large and complex datasets are as likely to deceive learning algorithms into a "simplicity bubble" as any other particular dataset. These deceiving datasets guarantee that any prediction will diverge from the high-algorithmic-complexity globally optimal solution while converging toward the low-algorithmic-complexity locally optimal solution. We discuss the framework and empirical conditions for circumventing this deceptive phenomenon, moving away from statistical machine learning towards a stronger type of machine learning based on, or motivated by, the intrinsic power of algorithmic information theory and computability theory.
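The central bound of the abstract can be rendered schematically. The notation here is assumed ($\mathbf{m}$ for algorithmic probability, $C_M$ a constant depending only on the learning algorithm $M$); it is a paraphrase of the English statement above, not the paper's exact theorem:

```latex
\text{For every finite learning algorithm } M \text{ there exist } n_0
\text{ and a deceiving dataset } z \text{ with } |z| \ge n_0 \text{ such that}
\qquad
\mathbf{m}(y) \;\le\; C_M \, \mathbf{m}(z)
\quad \text{for every dataset } y \text{ with } |y| \ge |z|.
```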
- North America > United States > New York > New York County > New York City (0.14)
- South America > Brazil (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- (3 more...)
A Computable Piece of Uncomputable Art whose Expansion May Explain the Universe in Software Space
The machine stops when it reaches a certain configuration (a combination of what it reads and its internal state). A Turing machine is said to produce an output if it halts; the locations on the tape the machine has visited represent the output produced. The most remarkable idea advanced by Turing is his demonstration that there is an 'a' machine that is able to read other 'a' machines and behave as they would for an input s. In other words, Turing proved that it was not necessary to build a new machine for each different task; a single machine that could be reprogrammed sufficed. This erases the distinction between program and data, as well as between software and hardware, as one can always codify data as a program to be executed by another Turing machine and vice versa, just as one can always build a universal machine to execute any program.
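Turing's program-as-data insight is easy to make concrete: a short interpreter whose input includes the description of the machine to run. The encoding below (a rule table keyed by state and read symbol) is an assumed convention for illustration:

```python
def run_tm(rules, tape, state="start", steps=1000):
    """Minimal Turing machine interpreter. The machine being run is
    ordinary data (`rules`), illustrating that one universal program
    suffices: programs and data are interchangeable.

    rules maps (state, read_symbol) -> (write_symbol, move, next_state),
    with move in {-1, +1}; '_' is the blank symbol. The run stops on the
    state 'halt', on a missing rule, or after `steps` steps."""
    cells = dict(enumerate(tape))
    pos = 0
    for _ in range(steps):
        if state == "halt":
            break
        key = (state, cells.get(pos, "_"))
        if key not in rules:
            break
        write, move, state = rules[key]
        cells[pos] = write
        pos += move
    return "".join(cells[i] for i in sorted(cells)).strip("_")

# One particular machine, encoded as data: flip every bit, halt on blank.
flipper = {
    ("start", "0"): ("1", +1, "start"),
    ("start", "1"): ("0", +1, "start"),
    ("start", "_"): ("_", +1, "halt"),
}
```

Feeding `run_tm` a different rule table runs a different machine; no new interpreter is needed, which is exactly the reprogrammability the passage describes.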
- North America > Canada > Alberta > Census Division No. 13 > Woodlands County (0.04)
- Asia > Singapore (0.04)
- North America > United States > Vermont (0.04)
- (7 more...)
Kolmogorov Regularization for Link Prediction
Flood, Paris D. L., Viñas, Ramon, Liò, Pietro
Link prediction in graphs is an important task in the fields of network science and machine learning. We propose a flexible means of regularization for link prediction based on an approximation of the Kolmogorov complexity of graphs. Informally, the Kolmogorov complexity of an object is the length of the shortest computer program that produces the object. Complex networks are often generated, in part, by simple mechanisms; for example, many citation networks and social networks are approximately scale-free and can be explained by preferential attachment. A preference for predicting graphs with simpler generating mechanisms motivates our choice of Kolmogorov complexity as a regularization term. Our method is differentiable, fast and compatible with recent advances in link prediction algorithms based on graph neural networks. We demonstrate the effectiveness of our regularization technique on a set of diverse real-world networks.
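A rough sketch of the regularization idea, with the zlib-compressed length of the adjacency matrix as an assumed stand-in for the paper's differentiable Kolmogorov complexity approximation (this stand-in is not differentiable, so it illustrates only the scoring, not the training loop):

```python
import random
import zlib

def adjacency_bits(adj):
    """Flatten a 0/1 adjacency matrix into a byte string of '0'/'1' chars."""
    return "".join("1" if x else "0" for row in adj for x in row).encode()

def kolmogorov_penalty(adj):
    """Compressed length of the predicted graph: a crude stand-in for an
    approximation of its Kolmogorov complexity."""
    return len(zlib.compress(adjacency_bits(adj)))

def regularized_score(log_likelihood, adj, lam=0.1):
    """Favour predicted graphs with simpler generating mechanisms."""
    return log_likelihood - lam * kolmogorov_penalty(adj)

n = 20
# A ring lattice: a graph with a very simple generating mechanism.
ring = [[1 if (i - j) % n in (1, n - 1) else 0 for j in range(n)] for i in range(n)]
# An irregular graph of the same size, for comparison.
rng = random.Random(0)
noisy = [[rng.randint(0, 1) for _ in range(n)] for _ in range(n)]
```

The regular lattice compresses far better than the irregular graph, so it incurs a smaller penalty, mirroring the preference for simple generating mechanisms such as preferential attachment.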
- North America > United States > Illinois > Cook County > Chicago (0.05)
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- Oceania > New Zealand > North Island > Auckland Region > Auckland (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Research Report (0.50)
- Overview (0.46)
Objective and Subjective Solomonoff Probabilities in Quantum Mechanics
Algorithmic probability has shown some promise in dealing with the probability problem in the Everett interpretation, since it provides an objective, single-case probability measure. Many find the Everettian cosmology to be overly extravagant, however, and algorithmic probability has also provided improved models of subjective probability and Bayesian reasoning. I attempt here to generalize algorithmic Everettianism to more Bayesian and subjectivist interpretations. I present a general framework for applying generative probability, of which algorithmic probability can be considered a special case. I apply this framework to two commonly vexing thought experiments that have immediate application to quantum probability: the Sleeping Beauty and Replicator experiments.
- North America > United States > New York (0.04)
- North America > Canada > Ontario > Toronto (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Coding-theorem Like Behaviour and Emergence of the Universal Distribution from Resource-bounded Algorithmic Probability
Zenil, Hector, Badillo, Liliana, Hernández-Orozco, Santiago, Hernández-Quiroz, Francisco
Previously referred to as 'miraculous' in the scientific literature because of its powerful properties and its wide application as an optimal solution to the problem of induction/inference, (approximations to) Algorithmic Probability (AP) and the associated Universal Distribution are (or should be) of the greatest importance in science. Here we investigate the emergence, the rates of emergence and convergence, and the Coding-theorem-like behaviour of AP in Turing-subuniversal models of computation. We investigate empirical distributions of computing models in the Chomsky hierarchy. We introduce measures of algorithmic probability and algorithmic complexity based upon resource-bounded computation, in contrast to previously thoroughly investigated distributions produced from the output distribution of Turing machines. This approach allows for numerical approximations to algorithmic (Kolmogorov-Chaitin) complexity-based estimations at each level of a computational hierarchy. We demonstrate that all these estimations are correlated in rank and that they converge both in rank and values as a function of computational power, despite fundamental differences between computational models. In the context of natural processes that operate below the Turing universal level because of finite resources and physical degradation, the investigation of natural biases stemming from algorithmic rules may shed light on the distribution of outcomes. We show that up to 60% of the simplicity/complexity bias in distributions produced even by the weakest of the computational models can be accounted for by Algorithmic Probability in its approximation to the Universal Distribution.
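The empirical, resource-bounded construction can be miniaturised: enumerate every program of a tiny bounded model of computation, tally the outputs, and observe that simple outputs accumulate most of the mass, a coding-theorem-like behaviour. The two-bit opcode machine below is an invented toy, far weaker than the models studied in the paper:

```python
from collections import Counter
from itertools import product

def run(program, max_out=16):
    """Toy resource-bounded interpreter (an invented model, not one from
    the paper): read 2-bit opcodes -- 00: append '0', 01: append '1',
    10: duplicate the output so far, 11: halt."""
    out = ""
    for i in range(0, len(program) - 1, 2):
        op = program[i:i + 2]
        if op == "00":
            out += "0"
        elif op == "01":
            out += "1"
        elif op == "10":
            out += out
        else:                  # "11": halt
            break
        if len(out) > max_out:
            return None        # resource bound exceeded
    return out

# Empirical analogue of the Universal Distribution: run every program up
# to 8 bits and tally the outputs produced.
counts = Counter()
for length in range(2, 9, 2):
    for bits in product("01", repeat=length):
        out = run("".join(bits))
        if out:
            counts[out] += 1
```

Short, regular strings such as "0" are produced by many programs while longer irregular strings have few or no producers, which is the frequency-to-complexity correspondence the abstract quantifies.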
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
- North America > Mexico (0.04)
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- (6 more...)